DETAILED ACTION
Remarks 

1. This communication is responsive to the amendment dated September 30th,
2008. In the amendment dated September 30th, 2008, Claims 1-4, 8-10, 14-20, and 22
are pending, Claims 1-4, 8, 10, and 16-20 are amended, Claims 5-7, 11-13, and 21 are
canceled, and Claims 1, 14, 17, and 22 are independent Claims. The examiner
acknowledges that no new matter was introduced and the amended claims are 
supported by the specification. This action is made FINAL. 

Response to Arguments 

2. Applicant's arguments dated September 30th, 2008 with respect to Claims 1-4,
8-10, 14-20, and 22 have been considered but are not persuasive.

3. As to Applicant's arguments with respect to Claims 1-4, 8-10, 14-20, and 22 for 
the prior art(s) allegedly not teaching or suggesting "compressing each of the seven 
supersamples to sixteen bits of precision," the examiner respectfully disagrees. 
Powell, col. 3, lines 35-48 with Sharangpani, col. 1, lines 22-27 with Broder, col. 9, lines 
11-15 was used to reject this claimed limitation below. Specifically, Powell, col. 3, lines
35-48 teaches that a signature (functionally the claimed supersample) can be 16 to 32
bits long. The citations to Sharangpani and Broder help to incorporate the teaching in
Powell into the combination of the references by showing that the 16 bits is obtained by
a compression technique and how Powell is related to the fingerprints of Broder. As
such, the prior art(s) appear to teach "compressing each of the seven supersamples to
sixteen bits of precision." 

4. In response to applicant's argument that there is no suggestion to combine the 
references, the examiner recognizes that obviousness can only be established by 
combining or modifying the teachings of the prior art to produce the claimed invention 
where there is some teaching, suggestion, or motivation to do so found either in the 
references themselves or in the knowledge generally available to one of ordinary skill in 
the art. See In re Fine, 837 F.2d 1071, 5 USPQ2d 1596 (Fed. Cir. 1988) and In re
Jones, 958 F.2d 347, 21 USPQ2d 1941 (Fed. Cir. 1992). In this case, teaching, 
suggestion, or motivation comes from the references themselves and knowledge 
generally available to one of ordinary skill in the art (see rejections below). 

5. Any other claims argued merely because of a dependency on a previously 
argued claim(s) in the arguments presented to the examiner, September 30th, 2008, are
moot in view of the examiner's interpretation of the claims and art and are still
considered rejected based on their respective rejections from at least a prior Office
action (parts of which are recited below).

Response to Amendment 
Claim Objections 

6. In light of the applicant's respective arguments or respective amendments, some 
previous claim objection(s) to the claims have been withdrawn. However, new 
objection(s) are warranted by the amended claims.

7. Claim 1 is objected to because of the following informality:

a. Claim 1 still does not appear to be indented properly according to 37
C.F.R. 1.75(i) or MPEP 608.01(i)(i).
Appropriate correction is required. 

Claim Rejections - 35 USC § 103 

8. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

9. Claims 1-4, 8-10, and 17-20 are rejected under 35 U.S.C. 103(a) as being
unpatentable over U.S. Patent No. 6,349,296 (Broder et al.) in view of U.S. Patent No. 
6,658,423 (Pugh et al.) (or alternatively, Applicant admitted prior art (AAPA) for a 
limitation), in view of U.S. Patent No. 6,058,410 (Sharangpani), further in view of U.S. 
Patent No. 5,721,788 (Powell et al.). 

For Claim 1, Broder teaches: "A method for detecting similar objects in a 
collection of such objects, [Broder, col. 4, lines 6-15 with Broder, Fig. 3] the method 
comprising: 

• processing a query to produce the collection of objects; [Broder, col. 11, lines 8-11
with Broder, col. 11, lines 20-23]

• ...and, for each of two objects:

• modifying a previous method for detecting similar objects [Broder, col. 4, lines 6-15
with Broder, Fig. 3] ...wherein the modifying comprises:

• ...requiring a number of matching supersamples out of the seven 
supersamples in order to conclude that the two objects are sufficiently similar" 
[Broder, col. 9, lines 1-3 with Broder, col. 9, lines 11-12 with Broder, col. 9, 
line 19]. 

Broder discloses the above limitations but does not expressly teach:

• "...constructing a plurality of hash tables for the collection of objects produced by 
processing the query; 

• ...so that memory requirements are reduced while avoiding false detections 
approximately as well as in the previous method, 

• ...compressing each of the seven supersamples to sixteen bits of precision;
and 

• wherein the number of matching supersamples is greater than a number of 
matching supersamples required in the previous method." 

With respect to Claim 1, an analogous art, Pugh, teaches:

• "...constructing a plurality of hash tables for the collection of objects produced by
processing the query; [(Pugh, col. 7, lines 49-54 with Pugh, Fig. 3 or AAPA p. 6,
lines 2-10) with Broder, col. 11, lines 8-11 with Broder, col. 11, lines 20-23]

• ...while avoiding false detections approximately as well as in the previous 
method, [Pugh, col. 3, lines 35-43] 



Application/Control Number: 10/805,805 Page 6 

Art Unit: 2161 

• ... combining four samples of features into seven supersamples; [Pugh, col. 
9, lines 29-31 with Pugh, cols. 11-12, lines 65-3 with Pugh, col. 12, lines 39- 
46 with Broder, col. 9, lines 16-22] 

• ...wherein the number of matching supersamples is greater than a number of
matching supersamples required in the previous method" [Pugh, col. 3, lines 
35-43 with Broder, col. 9, lines 1-3 with Broder, col. 9, lines 11-12 with 
Broder, col. 9, line 19]. 

With respect to Claim 1, an analogous art, Sharangpani, teaches:
• "...so that memory requirements are reduced" [Sharangpani, col. 1, lines 22-27 
with Broder, col. 9, lines 11-15]. 

With respect to Claim 1, an analogous art, Powell, teaches: 

• "...compressing each of the seven supersamples to sixteen bits of precision" 
[Powell, col. 3, lines 35-48 with Sharangpani, col. 1, lines 22-27 with Broder, 
col. 9, lines 11-15]. 

It would have been obvious to one of ordinary skill in the art at the time of 
invention having the teachings of Pugh, Sharangpani, Powell and Broder before him/her 
to combine Pugh, Sharangpani, and Powell with Broder because the inventions are 
directed towards, for example, computing bits in a computer or duplicate processing. 
As such the inventions are in the field of applicant's endeavor and/or are reasonably 
pertinent to the particular problem with which the applicant is concerned. 

Pugh's, Sharangpani's, and Powell's inventions would have been expected to 
successfully work well with Broder's invention because the inventions use computers
and signatures/fingerprints with bit computation to detect duplicates. Broder discloses a
(previous) method for clustering closely resembling data objects comprising samples, 
supersamples, and finding similar documents. However, Broder does not explicitly 
disclose a reduction in samples to form a supersample, using a 16-bit fingerprint to 
represent a fingerprint/supersample, hash tables, and a greater number of matching 
supersamples to have objects sufficiently similar. Pugh discloses detecting duplicate 
and near-duplicate files comprising detecting duplicates using, essentially, any number 
of (matching) fingerprints where fingerprints are combined from, essentially, any number 
of samples and a form of hash tables. Sharangpani discloses a method and apparatus 
for selecting a rounding mode for a numeric operation comprising truncating (removing) 
any number of bits to a desired precision (thus reducing memory requirements). Powell 
discloses a method and system for digital image signatures comprising reduced (16) 
bits of precision for a fingerprint/signature. 
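
The combination described above can be illustrated with a minimal sketch. It is not taken from Broder, Pugh, Sharangpani, or Powell; the function names, the use of SHA-256 as the combining hash, and the assumption that each sample is a non-negative 64-bit integer are all hypothetical choices made only for illustration. The sketch folds 28 samples into seven supersamples of four samples each, keeps 16 bits of each, and compares two objects by counting matching supersamples.

```python
import hashlib

NUM_SUPERSAMPLES = 7          # seven supersamples per object, as recited in the claims
SAMPLES_PER_SUPERSAMPLE = 4   # four samples combined into each supersample


def compress16(samples):
    """Hash a group of samples and keep only 16 bits of the result.

    Assumption: each sample is a non-negative integer below 2**64.
    SHA-256 is an illustrative choice, not a hash named in the cited art.
    """
    data = b"".join(s.to_bytes(8, "big") for s in samples)
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:2], "big")  # first two bytes = 16 bits


def supersamples(samples):
    """Fold 28 sample values into seven 16-bit supersamples."""
    needed = NUM_SUPERSAMPLES * SAMPLES_PER_SUPERSAMPLE
    if len(samples) != needed:
        raise ValueError(f"expected {needed} samples, got {len(samples)}")
    return [
        compress16(samples[i:i + SAMPLES_PER_SUPERSAMPLE])
        for i in range(0, needed, SAMPLES_PER_SUPERSAMPLE)
    ]


def similar(a, b, required):
    """Return True if at least `required` supersamples match position by position."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches >= required
```

Under these assumptions, two objects whose samples differ in only two of the seven groups would still satisfy a threshold of five matching supersamples, while a threshold of seven would require every group to agree.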

It would have been obvious to one of ordinary skill in the art at the time of 
invention having the teachings of Pugh, Sharangpani, Powell and Broder before him/her 
to take the content of the fingerprints, matching requirements, and hash tables from 
Pugh, the removal/truncation of bits from Sharangpani, and the size of the 
fingerprints/signatures from Powell and install them into the invention of Broder, thereby 
offering the obvious advantage of a reduced memory footprint (by using smaller 
(truncated) fingerprints/signatures) and having a reduced number of false positives. 

Furthermore, it appears that the Applicant's claimed invention is a mere 
modification of numbers, parameters, and thresholds from the previous method. For
instance, Broder, at the very least, teaches that other ranges of numbers, variables,
parameters, and thresholds can be used in stating that certain numbers, variables,
parameters, and thresholds were selected on an exemplary basis (Broder, col. 8, lines
62-67). As such, MPEP 2144.05 should be observed since the claimed invention
appears to claim an obvious optimization of ranges. Court cases of interest
dealing with this are In re Alter, 220 F.2d 454, 456, 105 USPQ 233, 235 (CCPA 1955),
Peterson, 315 F.3d at 1330, 65 USPQ2d at 1382, In re Hoeschele, 406 F.2d 1403, 160
USPQ 809 (CCPA 1969), Merck & Co. Inc. v. Biocraft Laboratories Inc., 874 F.2d 804,
10 USPQ2d 1843 (Fed. Cir.), cert. denied, 493 U.S. 975 (1989), In re Kulling, 897 F.2d
1147, 14 USPQ2d 1056 (Fed. Cir. 1990), In re Geisler, 116 F.3d 1465, 43 USPQ2d
1362 (Fed. Cir. 1997), In re Antonie, 559 F.2d 618, 195 USPQ 6 (CCPA 1977), and In 
re Boesch, 617 F.2d 272, 205 USPQ 215 (CCPA 1980). This conclusion is also 
somewhat supported by KSR v. Teleflex in that it is "obvious to try" values to obtain a 
desired result (in this case obtaining the least number of false positives while 
conserving computer resources). 

Claim 2 can be mapped to Broder (as modified by Pugh, Sharangpani, and
Powell) as follows: "The method of claim 1, wherein requiring the number of matching
supersamples comprises requiring at least six of the seven supersamples to match"
[Pugh, col. 3, lines 35-43 with Pugh, cols. 11-12 with Broder, col. 8, lines 62-67 with
Broder, col. 9, lines 1-3 with Broder, col. 9, lines 11-20 with Broder, col. 9, line 19].

Claim 3 can be mapped to Broder (as modified by Pugh, Sharangpani, and
Powell) as follows: "The method of claim 1, wherein requiring the number of matching
supersamples comprises requiring at least five of the seven supersamples to match"
[Pugh, col. 3, lines 35-43 with Pugh, cols. 11-12 with Broder, col. 8, lines 62-67 with
Broder, col. 9, lines 1-3 with Broder, col. 9, lines 11-20 with Broder, col. 9, line 19].

Claim 4 can be mapped to Broder (as modified by Pugh, Sharangpani, and
Powell) as follows: "The method of claim 1, wherein requiring the number of matching
supersamples comprises requiring all seven supersamples to match" [Pugh, col. 3, lines
35-43 with Pugh, cols. 11-12 with Broder, col. 8, lines 62-67 with Broder, col. 9,
lines 1-3 with Broder, col. 9, lines 11-20 with Broder, col. 9, line 19]. 
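
The effect of choosing a threshold of five, six, or seven matching supersamples can be illustrated with a short calculation. This is only a sketch under an idealized model that does not appear in this action or in the cited references: it assumes that two unrelated objects produce independent, uniformly distributed 16-bit supersamples, so each position matches by chance with probability 2^-16. The function name and parameters are hypothetical.

```python
from fractions import Fraction
from math import comb


def chance_match_probability(required, total=7, bits=16):
    """Probability that at least `required` of `total` supersamples match by chance.

    Assumption: each supersample of two unrelated objects matches independently
    with probability 2**-bits (an idealized model, not a measured value).
    """
    p = Fraction(1, 2 ** bits)
    return sum(
        comb(total, j) * p ** j * (1 - p) ** (total - j)
        for j in range(required, total + 1)
    )


for k in (5, 6, 7):
    print(k, float(chance_match_probability(k)))
```

Under this model, raising the required number of matches lowers the chance of a purely accidental match, which is the trade-off against the shorter 16-bit values that the threshold limitations address.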

Claim 8 can be mapped to Broder (as modified by Pugh, Sharangpani, and 
Powell) as follows: "The method of claim 1, wherein the objects are documents, [Broder,
col. 11, lines 8-11 with Broder, col. 11, lines 19-28] and the method is used in
association with a search engine query service to determine clusters of query results
that are near-duplicate documents" [Broder, col. 11, lines 8-11 with Broder, col. 11, lines
19-28]. 

Claim 9 can be mapped to Broder (as modified by Pugh, Sharangpani, and 
Powell) as follows: "The method of claim 8, further comprising selecting a single 
document in each cluster to report" [Pugh, col. 10, lines 50-57 or Broder, col. 10, lines 
15-18]. 

Claim 10 can be mapped to Broder (as modified by Pugh, Sharangpani, and 
Powell) as follows: "The method of claim 9, wherein selecting the single document is by 
way of a ranking function" [Pugh, col. 10, lines 50-57].

Claims 17-20 encompass substantially the same scope of the invention as that
of Claims 1-4, respectively, in addition to a computer-readable storage medium and
some instructions for performing the method steps of Claims 1-4, respectively.
Therefore, Claims 17-20 are rejected for the same reasons as stated above with respect
to Claims 1-4, respectively.

10. Claims 14-16 and 22 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over U.S. Patent No. 6,349,296 (Broder et al.) in view of U.S. Patent No. 5,721,788
(Powell et al.), further in view of U.S. Patent No. 6,658,423 (Pugh et al.) (or 
alternatively, Applicant admitted prior art (AAPA) for a limitation). 

For Claim 14, Broder teaches: "A method for determining groups of near- 
duplicate items [Broder, col. 4, lines 6-15 with Broder, Fig. 3] in a search engine query 
result, [Broder, col. 11, lines 8-11 with Broder, col. 11, lines 20-23] the method
comprising ...and, for each of two items being compared."

Broder discloses the above limitation but does not expressly teach: "constructing 
a plurality of hash tables for the items in the search query result 

• ...combining four samples of features into each of seven supersamples;

• compressing each supersample to 16 bits of precision; and 

• requiring five of the seven supersamples to match." 

With respect to Claim 14, an analogous art, Pugh, teaches: "constructing a
plurality of hash tables for the items in the search query result [(Pugh, col. 7, lines 49-54
with Pugh, Fig. 3 or AAPA p. 6, lines 2-10) with Broder, col. 11, lines 8-11 with Broder,
col. 11, lines 20-23]

• ...combining four samples of features into each of seven supersamples; [Pugh,
col. 9, lines 29-31 with Pugh, cols. 11-12, lines 65-3 with Broder, col. 9, lines 16- 
22] 

• ...requiring five of the seven supersamples to match" [Pugh, col. 3, lines 35-43 
with Broder, col. 9, lines 11-20]. 

With respect to Claim 14, an analogous art, Powell, teaches: 

• "...compressing each supersample to 16 bits of precision" [Powell, col. 3, lines 
35-48]. 

It would have been obvious to one of ordinary skill in the art at the time of 
invention having the teachings of Powell, Pugh and Broder before him/her to combine 
Powell and Pugh with Broder because the inventions are in the field of applicant's 
endeavor or are reasonably pertinent to the particular problem with which the applicant 
is concerned. 

Powell's and Pugh's inventions would have been expected to successfully work 
well with Broder's invention because the inventions use computers and 
signatures/fingerprints with bits to detect duplicates. Broder discloses a (previous) 
method for clustering closely resembling data objects comprising samples, 
supersamples, and finding similar documents. However, Broder does not explicitly 
disclose a reduction in samples to form a supersample, reduction in bits of precision for 
the fingerprints, hash tables, and a greater number of matching supersamples to have
objects sufficiently similar. Powell discloses a method and system for digital image
signatures comprising reduced (16) bits of precision for a fingerprint. Pugh discloses 
detecting duplicate and near-duplicate files comprising detecting duplicates using, 
essentially, any number of matching fingerprints where fingerprints are combined from, 
essentially, any number of samples and a form of hash tables. 
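
The role of the hash tables in this combination can be illustrated with a minimal sketch. It is not the implementation of Pugh, Broder, or the applicant; the function names, the one-table-per-supersample-position layout, and the dictionary input format are assumptions made only for illustration. Each table maps a 16-bit supersample value to the items that share it at that position, and a pair of items is reported when it shares a bucket in at least the required number of tables.

```python
from collections import defaultdict
from itertools import combinations


def build_tables(items, num_supersamples=7):
    """Build one hash table per supersample position.

    `items` maps a (hypothetical) item id to its list of supersample values.
    Each table maps a supersample value to the ids of the items holding it.
    """
    tables = [defaultdict(list) for _ in range(num_supersamples)]
    for item_id, values in items.items():
        for position, value in enumerate(values[:num_supersamples]):
            tables[position][value].append(item_id)
    return tables


def near_duplicate_pairs(items, required=5, num_supersamples=7):
    """Return the pairs of ids that match in at least `required` positions."""
    counts = defaultdict(int)
    for table in build_tables(items, num_supersamples):
        for bucket in table.values():
            # Every pair sharing a bucket matches at this supersample position.
            for a, b in combinations(sorted(bucket), 2):
                counts[(a, b)] += 1
    return {pair for pair, n in counts.items() if n >= required}
```

For example, with `items = {"a": [1, 2, 3, 4, 5, 6, 7], "b": [1, 2, 3, 4, 5, 0, 0], "c": [9] * 7}`, items "a" and "b" agree in five positions, so a threshold of five would report that pair while a threshold of six would report none.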

It would have been obvious to one of ordinary skill in the art at the time of 
invention having the teachings of Powell, Pugh and Broder before him/her to take the 
size of the fingerprints/signatures from Powell, and the content of the fingerprints and 
matching requirements from Pugh and install them into the invention of Broder, thereby 
offering the obvious advantage of a reduced memory footprint (by using smaller 
fingerprints/signatures) and having a reduced number of false positives.

Furthermore, it appears that the Applicant's claimed invention is a mere 
modification of numbers, parameters, and thresholds from Broder's method. For 
instance, Broder, at the very least, teaches that other ranges of numbers, variables, 
parameters, and thresholds can be used in stating that certain numbers, variables, 
parameters, and thresholds were selected on an exemplary basis (Broder, col. 8, lines 
62-67). As such, MPEP 2144.05 should be observed since the claimed invention 
appears to claim an obvious optimization of ranges. Court cases of interest
regarding this are In re Alter, 220 F.2d 454, 456, 105 USPQ 233, 235 (CCPA 1955),
Peterson, 315 F.3d at 1330, 65 USPQ2d at 1382, In re Hoeschele, 406 F.2d 1403, 160
USPQ 809 (CCPA 1969), Merck & Co. Inc. v. Biocraft Laboratories Inc., 874 F.2d 804,
10 USPQ2d 1843 (Fed. Cir.), cert. denied, 493 U.S. 975 (1989), In re Kulling, 897 F.2d
1147, 14 USPQ2d 1056 (Fed. Cir. 1990), In re Geisler, 116 F.3d 1465, 43 USPQ2d
1362 (Fed. Cir. 1997), In re Antonie, 559 F.2d 618, 195 USPQ 6 (CCPA 1977), and In
re Boesch, 617 F.2d 272, 205 USPQ 215 (CCPA 1980). 

Claim 15 can be mapped to Broder (as modified by Powell and Pugh) as follows: 
"The method of claim 14, further comprising selecting a single document in each cluster 
to report" [Pugh, col. 10, lines 50-57 or Broder, col. 10, lines 15-18]. 

Claim 16 can be mapped to Broder (as modified by Powell and Pugh) as follows: 
"The method of Claim 15, wherein selecting the single document is by way of a ranking
function" [Pugh, col. 10, lines 50-57]. 

Claim 22 encompasses substantially the same scope of the invention as that of 
Claim 14, in addition to a computer-readable storage medium and some instructions for 
performing the method steps of Claim 14. Therefore, Claim 22 is rejected for the same 
reasons as stated above with respect to Claim 14. 

11. THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time
policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of
the advisory action. In no event, however, will the statutory period for reply expire later
than SIX MONTHS from the mailing date of this final action.

Conclusion

12. Any prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. Applicant is advised that, although not used in the rejections 
above, prior art cited on any PTO-892 form and not relied upon is considered materially 
relevant to the applicant's claimed invention and/or portions of the claimed invention. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Brent S. Stace whose telephone number is 571-272- 
8372 and fax number is 571-273-8372. The examiner can normally be reached on M-F 
9am-5:30pm EST. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Apu M. Mofiz, can be reached on 571-272-4080. The fax phone number for
the organization where this application or proceeding is assigned is 571-273-8300. 
Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published 
applications may be obtained from either Private PAIR or Public PAIR. Status 
information for unpublished applications is available through Private PAIR only. For 
more information about the PAIR system, see http://pair-direct.uspto.gov. Should you 
have questions on access to the Private PAIR system, contact the Electronic Business 
Center (EBC) at 866-217-9197 (toll-free). 

/B. S./

Examiner, Art Unit 2161 

/Apu M Mofiz/ 

Supervisory Patent Examiner, Art Unit 2161 



